An Analysis of Clarification Dialogue for Question Answering

Authors

  • Marco De Boni
  • Suresh Manandhar
Abstract

We examine clarification dialogue, a mechanism for refining user questions with follow-up questions, in the context of open domain Question Answering systems. We develop an algorithm for clarification dialogue recognition through the analysis of collected data on clarification dialogues and examine the importance of clarification dialogue recognition for question answering. The algorithm is evaluated and shown to successfully recognize the occurrence of clarification dialogue in the majority of cases and to simplify the task of answer retrieval.

1 Clarification dialogues in Question Answering

Question Answering Systems aim to determine an answer to a question by searching for a response in a collection of documents (see Voorhees 2002 for an overview of current systems). In order to achieve this (see for example Harabagiu et al. 2002), systems narrow down the search by using information retrieval techniques to select a subset of documents, or paragraphs within documents, containing keywords from the question and a concept which corresponds to the correct question type (e.g. a question starting with the word “Who?” would require an answer containing a person). The exact answer sentence is then sought either by attempting to unify the answer semantically with the question, through some kind of logical transformation (e.g. Moldovan and Rus 2001), or by some form of pattern matching (e.g. Soubbotin 2002; Harabagiu et al. 1999).

Often, though, a single question is not enough to meet the user's goals and an elaboration or clarification dialogue is required, i.e. a dialogue with the user which would enable the answering system to refine its understanding of the questioner's needs. For reasons of space we shall not investigate here the difference between elaboration dialogues, clarification dialogues and coherent topical subdialogues, and we shall hence refer to this type of dialogue simply as “clarification dialogue”, noting that this may not be entirely satisfactory from a theoretical linguistic point of view.

While a number of researchers have looked at clarification dialogue from a theoretical point of view (e.g. Ginzburg 1998; Ginzburg and Sag 2000; van Beek et al. 1993), or from the point of view of task oriented dialogue within a narrow domain (e.g. Ardissono and Sestero 1996), we are not aware of any work on clarification dialogue for open domain question answering systems such as the ones presented at the TREC workshops, apart from the experiments carried out for the (subsequently abandoned) “context” task in the TREC-10 QA workshop (Voorhees 2002; Harabagiu et al. 2002). Here we seek to partially address this problem by looking at some particular aspects of clarification dialogues in the context of open domain question answering. In particular, we examine the problem of recognizing that a clarification dialogue is occurring, i.e. how to recognize whether the current question under consideration is part of a previous series (i.e. clarifying previous questions) or the start of a new series; we then show how the recognition that a clarification dialogue is occurring can simplify the problem of answer retrieval.

2 The TREC Context Experiments

The TREC-2001 QA track included a "context" task which aimed at testing systems' ability to track context through a series of questions (Voorhees 2002). In other words, systems were required to respond correctly to a kind of clarification dialogue in which a full understanding of questions depended on an understanding of previous questions.
In order to test the ability to answer such questions correctly, a total of 42 questions were prepared by NIST staff, divided into 10 series of related question sentences which therefore constituted a type of clarification dialogue; the series varied in length between 3 and 8 questions, with an average of 4 questions per dialogue. These clarification dialogues were, however, presented to the question answering systems already classified, and hence systems did not need to recognize that clarification was actually taking place. Consequently, systems that simply looked for an answer in the subset of documents retrieved for the first question in a series performed well without any understanding of the fact that the questions constituted a coherent series. In a more realistic approach, systems would not be informed in advance of the start and end of a series of clarification questions and would not be able to use this information to limit the subset of documents in which an answer is to be sought.

3 Analysis of the TREC context questions

We manually analysed the TREC context question collection in order to determine what features could be used to determine the start and end of a question series, with the following conclusions:

• Pronouns and possessive adjectives: questions such as “When was it born?”, which followed “What was the first transgenic mammal?”, were referring to some previously mentioned object through a pronoun (“it”). The use of personal pronouns (“he”, “it”, ...) and possessive adjectives (“his”, “her”, ...) which did not have any referent in the question under consideration was therefore considered an indication of a clarification question.
• Absence of verbs: questions such as “On what body of water?” clearly referred to some previous question or answer.
• Repetition of proper nouns: the question series starting with “What type of vessel was the modern Varyag?” had a follow-up question “How long was the Varyag?”, where the repetition of the proper noun indicates that the same subject matter is under investigation.
• Importance of semantic relations: the first question series started with the question “Which museum in Florence was damaged by a major bomb explosion?”; follow-up questions included “How many people were killed?” and “How much explosive was used?”, where there is a clear semantic relation between the “explosion” of the initial question and the “killing” and “explosive” of the following questions. Questions belonging to a series were “about” the same subject, and this aboutness could be seen in the use of semantically related words.

4 Experiments in Clarification Dialogue Recognition

It was therefore speculated that an algorithm which made use of these features would successfully recognize the occurrence of clarification dialogue. Given that the only available data was the collection of “context” questions used in TREC-10, it was felt necessary to collect further data in order to test our algorithm rigorously. This was necessary both because of the small number of questions in the TREC data and because there was no guarantee that an algorithm built for this dataset would perform well on “real” user questions. A collection of 253 questions was therefore put together by asking potential users to seek information on a particular topic by asking a prototype question answering system a series of questions, with “cue” questions derived from the TREC question collection given as starting points for the dialogues. These questions made up 24 clarification dialogues, varying in length from 3 questions to 23, with an average length of 12 questions (the data is available from the main author upon request).
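
The four cues from Section 3 can be sketched as simple heuristics. The following Python is a minimal illustration, not the authors' algorithm: the word lists, the `RELATED` table and the function names `is_follow_up` and `split_into_series` are all assumptions introduced here. The verb list is a crude stand-in for part-of-speech tagging, the relatedness table stands in for a lexical resource, and the pronoun check does not test whether a referent exists inside the question itself.

```python
import re

# Assumed, non-exhaustive word lists; the source does not specify them.
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them",
            "his", "its", "their"}
# Crude stand-in for a part-of-speech tagger: a few common verb forms.
COMMON_VERBS = {"is", "was", "are", "were", "do", "does", "did",
                "has", "have", "had", "used", "killed", "damaged"}
# Hypothetical hand-written table standing in for a lexical resource.
RELATED = {"explosion": {"killed", "explosive"}}


def tokens(question):
    """Split a question into alphabetic tokens, keeping capitalisation."""
    return re.findall(r"[A-Za-z]+", question)


def proper_nouns(question):
    """Capitalised tokens other than the first word of the question."""
    return {t for t in tokens(question)[1:] if t[0].isupper()}


def is_follow_up(question, previous):
    """Return True if any of the four cues links `question` to `previous`.

    `previous` is the list of questions in the current series.
    """
    words = {t.lower() for t in tokens(question)}
    # Cue 1: personal pronoun or possessive adjective.
    if words & PRONOUNS:
        return True
    # Cue 2: no (known) verb in the question.
    if not words & COMMON_VERBS:
        return True
    prev_words, prev_proper = set(), set()
    for p in previous:
        prev_words |= {t.lower() for t in tokens(p)}
        prev_proper |= proper_nouns(p)
    # Cue 3: a proper noun repeated from an earlier question.
    if proper_nouns(question) & prev_proper:
        return True
    # Cue 4: a word related to a word in an earlier question.
    return any(RELATED.get(w, set()) & words for w in prev_words)


def split_into_series(questions):
    """Group a stream of questions into series of related questions."""
    series = []
    for q in questions:
        if series and is_follow_up(q, series[-1]):
            series[-1].append(q)
        else:
            series.append([q])
    return series
```

Applied to the example questions quoted in Section 3, `split_into_series` keeps “When was it born?” in the same series as the transgenic mammal question and starts a new series at the first Varyag question. A realistic version would need a proper tagger and a semantic similarity measure in place of the fixed lists.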
The differences between the TREC “context” collection and the new collection are summarized in the following table:

Groups   Qs   Av. len   Max   Min

Similar resources

Implementing clarification dialogues in open domain question answering

We examine the implementation of clarification dialogues, a mechanism for ensuring that question answering systems take into account user goals by allowing them to ask series of related questions either by refining or expanding on previous questions with follow-up questions, in the context of open domain Question Answering systems. We develop an algorithm for clarification dialogue recognition ...

Answering Clarification Questions

This paper describes the results of corpus and experimental investigation into the factors that affect the way clarification questions in dialogue are interpreted, and the way they are responded to. We present some results from an investigation using the BNC which show some general correlations between clarification request type, likelihood of answering, answer type and distance between questio...

A Chatbot-based Interactive Question Answering System

Interactive question answering (QA) systems, where a dialogue interface enables followup and clarification questions, are a recent field of research. We report our experience on the design, implementation and evaluation of a chatbot-based dialogue interface for our open-domain QA system, showing that chatbots can be effective in supporting interactive QA.

Does this answer your Question? Towards Dialogue Management for Restricted Domain Question Answering Systems

The main problem when going from task-oriented dialogue systems to interactive restricted-domain question answering systems is that the lack of task structure prohibits making simplifying assumptions as in task-oriented dialogue systems. In order to address this issue, we propose a solution that combines representations based on keywords extracted from the user utterances with machine learning t...

Coherence of Off-topic Responses for a Virtual Character

We demonstrate three classes of off-topic responses which allow a virtual question-answering character to handle cases where it does not understand the user’s input: ask for clarification, indicate misunderstanding, and move on with the conversation. While falling short of full dialogue management, a combination of such responses together with prompts to change the topic can improve overall dia...

Publication date: 2003